A Method for Mapping

Substrate Molecules 



Field of Invention 



The invention is immediately relevant to medical needs including inflammation, [two items
illegible], cancer, antimicrobials, virology, metabolic disease and allergy. Methods to
identify selective substrates of specific enzymes are indicated, based on
the detailed mapping of the substrate binding site using combinatorial peptide libraries.
These enzymes can be any enzyme that covalently modifies its physiological substrate
target, examples of which include, but are not limited to, protein kinases, protein
phosphatases, acetylases and ribosylases. These derived substrate-based compounds can
serve as a basis for further medicinal chemistry development of selective enzyme
inhibitors. The identification of short peptidic substrates using this methodology will also
allow for the rapid development of high throughput screens for compound screening.

Background to Invention 

The mapping method is exemplified using members of the protein kinase enzyme family,
but this method is applicable to other covalently modifying enzymes.

Phosphate transfer (phosphorylation) is the most common form of covalent protein
modification used by cells. Protein kinases are the enzymes that catalyse the transfer of the
γ-phosphate from adenosine triphosphate (ATP) to an amino acid residue (usually tyrosine,
threonine, serine or histidine) on a substrate molecule. Approximately 400 kinases are
currently known, and it is likely that this number will increase considerably in the next few
years as more information from gene sequencing databases becomes available.
Functionally, these molecules are intracellular enzymes that play key roles in cell growth,
differentiation and inter-cell communication. Aberrant protein kinase activity has been
implicated in many disease states including several forms of cancer and severe-combined
immunodeficiency disease (Barker et al, 1997; Lehtola et al; Arpaia et al, 1994; Elder et al,
1995; Roifman, 1995). Similarly, activation of protein kinase activity within mononuclear
cells is required to drive the cytokine production which underlies many autoimmune
diseases (Lee et al, 1994). Thus, inhibitor compounds capable of specifically inactivating
certain critical kinases may have considerable therapeutic benefit in a number of clinical
diseases.

All tyrosine and serine/threonine protein kinases have a region of approximately 300 amino
acids known as the catalytic subunit which has evolved from a common ancestor kinase
(Hanks and Quinn, 1991). Crystal structure determination of several kinases has shown that
they all have a common bi-lobal structure (Wilson et al, 1996; Zhang et al, 1994; Xu et al,
1997). The amino-terminal part of the subunit encodes a small lobe responsible for the
binding of ATP, whereas the carboxy-terminal residues encode a larger lobe important for
protein substrate binding. In the ternary structure of the active kinase, both the ATP and
the protein substrate binding sites are brought together, allowing transfer of the ATP
γ-phosphate to the amino acid acceptor on the protein substrate. The protein/peptide binding
groove stretches across the face of the large lobe between two α-helices and under the
small lobe. This groove therefore contains the residues important for defining the substrate
specificity of the kinase.

Many protein kinases are arranged in kinase cascades within the cell, providing the ability
for signal amplification in post-transduction pathways. This amplification relies on the
upstream kinase specifically activating its downstream partner. For this reason, protein
kinases have developed remarkable substrate specificities which prevent unwanted
crosstalk between different kinase cascades. This substrate specificity can be exploited in
the design of selective protein kinase inhibitors.

A technique has recently been developed by L. Cantley's laboratory to provide consensus
peptide protein kinase substrate information (WO 95/18823; Songyang et al, 1994). First, a
degenerate library of peptides with a phospho-acceptor such as tyrosine or serine/threonine
flanked by amino acids on each side is synthesised. A preferred number of degenerate
residues on each side of the phosphorylation site is four (corresponding to positions -1, -2,
-3, -4, +1, +2, +3, +4) relative to the phosphorylated residue. Thus the library consists of
peptides having a length of nine amino acids. The library is then phosphorylated by the
protein kinase of interest and phosphorylated peptides isolated from the
non-phosphorylated peptides by DEAE-sephacel and ferric chelation chromatography. The
phosphopeptide mixture is then sequenced and the frequency of each amino acid at every
position assessed to give a preferred substrate sequence. These studies have yielded
consensus substrate information, but do not allow a detailed analysis of particular
preferences for neighbouring residue interactions as pools of peptides are examined.
Furthermore, this type of analysis may not show up rare good substrates which could be
hidden by the presence of numerous poor substrates in the peptide pool. By this method
individual peptides can never be identified as individual sequences; the result is that an
average picture of substrate specificity is reached. Part of the problem is that each
individual peptide is represented at such a low level, and many inevitably will not even be
present. The results from Cantley's method do not represent individual peptides but a
consensus picture of protein kinase substrate specificity.
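
The limitation of a pooled consensus can be illustrated with a small sketch. The Python fragment below uses a purely hypothetical pool of four-residue sequences (the sequences and the function names are assumptions made for illustration, not data from the cited work). It tallies residue frequencies at each position and shows that the resulting consensus need not correspond to any individual peptide in the pool.

```python
from collections import Counter

def position_frequencies(peptides):
    """Count how often each residue occurs at each position of equal-length peptides."""
    length = len(peptides[0])
    return [Counter(p[i] for p in peptides) for i in range(length)]

def consensus(peptides):
    """Take the most frequent residue at every position (the pooled 'average picture')."""
    return "".join(c.most_common(1)[0][0] for c in position_frequencies(peptides))

# Hypothetical phosphorylated pool; not taken from any experiment.
pool = ["ADYF", "ADYF", "GEYF", "GEYF", "GDYW"]
best = consensus(pool)
print(best, best in pool)  # GDYF False
```

Here the consensus combines the most common residue at each position, yet that exact sequence was never present, which is the sense in which a pooled readout gives an average rather than a discrete sequence.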

Filamentous phage expressing gene III-linked degenerate peptide sequences have also been
used to generate substrate information (Schmitz et al, 1996; Dente et al, 1997); however,
this method is labour intensive and does not allow the use of unnatural amino acids or
peptidomimetics. Substrate information can also be obtained from knowledge of the
physiological kinase substrates. This approach is limited, and previous attempts to use this
information for the design of successful therapeutic cell permeable protein kinase
inhibitors have failed (Kemp et al, 1991).

We believe that identification of detailed substrate characteristics can ultimately map the
substrate binding groove and provide information that can lead to the design of enzyme
inhibitor molecules. For the reasons described above, there are no current methods for
obtaining this information. Therefore, we have invented a method of using small molecules
in a self deconvoluting library format to probe a large active site by positional scanning of
a target group. This method is rapid, not labour intensive and results in the identification of
discrete sequences.



Summary of Invention 



This invention provides for the active site mapping [illegible] in which a tyrosine, serine,
[illegible], histidine or aspartic acid residue, or any other residue containing an
appropriate side chain, is modified. The libraries [illegible] complexity over and above
[illegible] the self-deconvoluting libraries described in WO 97/42216 [illegible].



This involves making a library of smaller libraries (referred to as sub-sets) where a
residue is moved stepwise through the sequence of amino acids or other groups (such as
peptidomimetics [any compound that can be added to the substrate or inhibitor chain]). The
result is that in each sub-set of the library the fixed residue is found in a different position.
For example, in a library using four variable positions, five sub-sets in each library have to
be made, ZXXXX, XZXXX, XXZXX, XXXZX and XXXXZ, where Z is the fixed residue
and X are the four variable residues. We recognise that there may be a need in certain
circumstances for further invariant residues; however, these would occupy fixed positions
and would not affect either the scanning or the self-deconvoluting of the libraries. The
invariant residue(s) might be fixed in position relative to the modifiable residue Z or may
be fixed in position relative to the overall motif sequence. Additional fixed residues can be
added if desired, or one of the variable residues can be made invariant. In the latter case the
library would be a small part of the libraries described here. Cases where it is desirable to
include one or more fixed residues include libraries required to look at enzymes which
always require another invariant residue in another position. However, in cases where two
fixed residues are required, and they are both modified, it can be desirable to include this
residue in one of the variable positions (i.e. make it one of the residues chosen in a variable
position). The reason for this is that the sequence of events (the order in which the two
residues become modified) can then also be probed by this scanning library technique. In
this case it may also be beneficial to make an additional library in which the fixed residue is
not present at all, corresponding to the library XXXX. We would therefore have a library of
six sub-sets. These modifications are within the scope of the invention and would be
recognised by someone skilled in the art.
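
The sub-set layout described above can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are assumptions introduced here, and the letters Z and X simply follow the notation of the text.

```python
def scanning_subsets(n_variable, fixed="Z", variable="X", include_unfixed=False):
    """Return the positional-scanning sub-set patterns for one fixed residue.

    With n_variable variable positions the fixed residue can occupy
    n_variable + 1 positions, giving n_variable + 1 sub-sets.
    """
    length = n_variable + 1
    patterns = [
        variable * i + fixed + variable * (length - 1 - i)
        for i in range(length)
    ]
    if include_unfixed:
        # Optional extra sub-set with no fixed residue at all (e.g. XXXX).
        patterns.append(variable * n_variable)
    return patterns

print(scanning_subsets(4))
# ['ZXXXX', 'XZXXX', 'XXZXX', 'XXXZX', 'XXXXZ']
print(len(scanning_subsets(4, include_unfixed=True)))  # 6
```

Setting `include_unfixed=True` corresponds to the six sub-set case, in which a library lacking the fixed residue is screened alongside the five scanning sub-sets.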

It can be readily seen that by combining the data from each library sub-set, the residues
from -4 to +4 either side of the catalytic residue can be mapped:



A-B-C-D-Z 
B-C-D-Z-E 
C-D-Z-E-F
D-Z-E-F-G 
Z-E-F-G-H 



The mapped sequence would therefore be A-B-C-D-Z-E-F-G-H. 

The above example using 5 subsets of libraries of peptides of 5 amino acids allows the
mapping of a sequence of 9 amino acids. In general one could carry out the invention using
n subsets of n-mer peptides so as to provide mapping data for the residues from -(n-1) to
+(n-1) either side of the active site. Thus in general the length of the mapped sequence
would be (2n)-1.
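
As a hedged illustration of how overlapping sub-set results combine, the following Python sketch places each residue by its offset from the fixed residue Z and merges the five example windows into one mapped sequence. The function name is an assumption made for this example, and the single-letter labels are the placeholders used in the text rather than real amino acids.

```python
def merge_subset_hits(hits, fixed="Z"):
    """Merge one hit per sub-set into a single mapped sequence.

    Each hit is a list of residues containing the fixed residue exactly once.
    Residues are placed by their offset from the fixed residue; conflicting
    residues at the same offset raise an error.
    """
    by_offset = {}
    for hit in hits:
        z = hit.index(fixed)
        for i, residue in enumerate(hit):
            offset = i - z
            if by_offset.setdefault(offset, residue) != residue:
                raise ValueError(f"conflict at offset {offset}")
    return [by_offset[k] for k in sorted(by_offset)]

windows = ["A-B-C-D-Z", "B-C-D-Z-E", "C-D-Z-E-F", "D-Z-E-F-G", "Z-E-F-G-H"]
mapped = merge_subset_hits([w.split("-") for w in windows])
print("-".join(mapped))       # A-B-C-D-Z-E-F-G-H
print(len(mapped) == 2 * 5 - 1)  # True
```

With n = 5 the merged result spans offsets -4 to +4, that is 2n - 1 = 9 positions, in agreement with the general expression given above.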

Where the residue type at any given position relative to the fixed residue is similar in
different subsets, the data can be used in an additive manner. For example, if an aromatic
residue is required adjacent to the fixed residue, then any sequences which contain this
feature in any of the library subsets can be considered in an additive way.
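
One possible way to treat the data additively is sketched below in Python. It is a minimal illustration under stated assumptions: the peptide sequences, the signal values and the simple definition of an aromatic residue class are all hypothetical, and the function name is introduced only for this example.

```python
from collections import defaultdict

AROMATIC = set("FWY")  # assumption: a deliberately simple residue-class definition

def additive_scores(hits):
    """Sum assay signals by (offset from fixed residue, residue class).

    hits: iterable of (peptide, index_of_fixed_residue, signal).
    """
    totals = defaultdict(float)
    for peptide, z_index, signal in hits:
        for i, residue in enumerate(peptide):
            if i == z_index:
                continue  # skip the fixed residue itself
            kind = "aromatic" if residue in AROMATIC else "other"
            totals[(i - z_index, kind)] += signal
    return dict(totals)

# Hypothetical hits from three different sub-sets (fixed Tyr at different indices).
hits = [("EDYFE", 2, 1.25), ("DYFEL", 1, 0.75), ("EEDYW", 3, 0.5)]
scores = additive_scores(hits)
print(scores[(1, "aromatic")])  # 2.5
```

Because every hit is re-indexed relative to the fixed residue, an aromatic residue at +1 contributes to the same total regardless of which sub-set the hit came from.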

In this invention there is no need to separate modified from unmodified sequences because
of the self deconvoluting nature of the library. The assay screen produces a series of hits,
the patterns of which reveal the unique sequences in each well. This enables a pattern of
substrate preferences to be determined for any enzyme.

The unique sequences obtained using this invention can be used to provide substrates for
high throughput assays and provide detailed information about the active site to aid rational
drug design.

This invention can also be used as an inhibitor library to screen against known modifying
enzymes where a known substrate exists and can be set up in an assay format. Those skilled
in the art would realise that by replacing the fixed residue with a suitable compound that is
not modified an inhibitor library can be constructed. For example, if a modifiable fixed
tyrosine were to be changed to a tyrosine derivative residue that cannot be phosphorylated,
such as halogenated tyrosine, dopamine, or tyrosine substituted by aromatic compounds,
then an inhibitor library will be formed. This could allow the more direct identification of
prototype inhibitors of enzymes for rational drug design.

Use of these libraries could be extended to other systems where a defined endpoint is
desired, but the target enzyme is unknown. Such examples could include, but are not
limited to, bacterial lysis in growing cultures or inhibiting phosphorylation of transcription
factors in cell lysates.

In one embodiment of this invention the sequences identified by this method are
considerably smaller than have previously been reported for library screens on protein
kinase substrates, which makes them more amenable to computer modelling and drug
design. Furthermore, this methodology provides information about the relative relationships
between neighbouring residues of active substrates; information which is not available from
a straightforward oriented degenerate peptide library approach used by Cantley (Songyang
et al, 1994). Thus, this novel methodology provides a significant improvement in the
quality of substrate based information that is achievable, in comparison to that produced
from previously described methods.

This invention allows data to be obtained from single peptide rankings which could be used
to rationally design sets of enzyme inhibitor molecules which compete with the
physiological substrate for binding to the active site of the enzyme.

Description of the Drawings

Figure 1. Design of protein tyrosine kinase library. Each peptide consists of a biotin tag, an
epsilon amino hexanoic acid spacer and 5 amino acids, including a phosphorylatable
tyrosine residue. Each of the amino acid positions A-H is varied as described. For example,
A 1-10 means that position A is varied using 10 defined amino acids.

Figure 2. Best substrates identified by screening tyrosine library sub-sets 1 to 5 against
ZAP-70 protein tyrosine kinase. The protein tyrosine kinase library described in Figure 1
was phosphorylated for 30 minutes at 30°C using the catalytic domain of human ZAP-70.
Peptides were captured using streptavidin-coated 96 well plates and phosphotyrosine
detected using anti-phosphotyrosine antibody, anti-mouse IgG-HRP and
tetramethylbenzidine (see experimental methods). Best substrates were identified as those
which gave the highest amount of phosphate incorporation.

Figure 3. Km Determination of Biotin-εAHA-DEEDYFE(Nle) [SEQ ID No. 3]. The
catalytic domain of human ZAP-70 was used to phosphorylate varying concentrations of
peptide for 10 minutes at 30°C in the presence of radiolabelled P-γ-ATP. Peptide capture was
performed using streptavidin filter plates, scintillation fluid added, and counting performed
using a beta-counter (see experimental methods). Samples were assayed in triplicate.

Figures 4 to 17. Component distributions in the plates of a library matrix.

Description of the Invention

This invention provides for the active site mapping of enzymes which catalyse covalent
modification including, but not limited to, phosphorylation, acylation and dephosphorylation,
in which a residue such as a tyrosine, serine, threonine, histidine, aspartic acid or any
other residue containing an appropriate side chain is modified.

Thus the invention provides a method for determining an amino acid sequence motif or a
peptidomimetic sequence motif containing an active site capable of being bound by an
enzyme which catalyses covalent modification of a substrate molecule, comprising:

a) contacting the enzyme with a library consisting of a number of oriented
degenerate library subsets of molecules, each subset comprising
unmodified degenerate motif sequences each having n residues and each
having a modifiable residue at a different fixed non-degenerate position,
under conditions which allow for modification of molecules which are a
substrate for the enzyme;

b) allowing the enzyme to modify modifiable residues in library subsets
containing molecules having an active substrate site for the enzyme;

c) deconvoluting the oriented degenerate library subsets of the library, in situ,
without separating modified from unmodified molecules, so as to reveal
the sequence of any motif which has been modified by covalent binding of
the enzyme;

wherein each library subset is of formula (I)

(Xaa)x Zaa (Xaa)y (I) [SEQ ID No. 1]

wherein

Zaa is a non-degenerate modifiable natural or unnatural amino acid residue or
peptidomimetic;

Xaa is any natural or unnatural amino acid residue or peptidomimetic;
x and y are each independently 0 or an integer;
(x + y) = (n - 1); and

n = an integer from 3 to 8, preferably 5.

This invention can be applied for instance to a protein tyrosine kinase in order to exemplify
the technology. It provides a rapid method of identifying discrete protein kinase substrate
sequences which allows pharmacophore generation and design of active site inhibitors. This
invention can also be used to directly identify protein kinase inhibitor molecules. General
formula: (Xaa)x Tyr (Xaa)y [SEQ ID No. 2].

In the first exemplification of this invention, a recombinant form of the human ZAP-70
enzyme was used in an in vitro phosphorylation reaction to phosphorylate the five substrate
sub-libraries which scan the sequence -4 to +4 around a central tyrosine residue (Figure 1).
The libraries were arranged in 96 well microtitre plate format with pools of 20 peptides in
each microtitre well. However, those skilled in the art will realise that the library can be
constructed on any scale. For example, the library sub-set can be miniaturised on a "chip"
scale or constructed on a large bulk scale depending on the requirements of the library.

Library peptides were made with biotin tags, which allowed peptide capture on
streptavidin-coated microtitre plates. Detection of phosphotyrosine was achieved using
anti-phosphotyrosine antibody detection in an ELISA assay using tetramethylbenzidine
substrate and recording absorbance at 450 nm. Background absorbance readings of 0.1 to
0.2 were recorded while the highest substrate peptide value was 1.5. Deconvolution of the
hit peptides was performed as described in WO 97/42216 and Example 5. Clear defined
substrates were deconvoluted in library sub-sets 1 to 4, but not in 5. This probably
reflects the absolute requirement of ZAP-70 for an amino acid residue in the -1 position.
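
The reasoning about sub-set 5 can be checked with a short Python sketch. It assumes the sub-set numbering used in the sub-set listing later in this document (sub-set 1 has the fixed residue last, sub-set 5 has it first); the function name is introduced only for illustration.

```python
def offsets_covered(n, subset_number):
    """Offsets (relative to the fixed residue) probed by one sub-set of an n-mer library.

    Assumed numbering: sub-set 1 has the fixed residue in the last position,
    sub-set n has it in the first position.
    """
    z_index = n - subset_number
    return [i - z_index for i in range(n) if i != z_index]

for k in range(1, 6):
    offs = offsets_covered(5, k)
    print(k, offs, "has -1 residue:", -1 in offs)
```

Under this numbering only sub-set 5 lacks a residue at the -1 position, which is consistent with the observation that no clear substrates were deconvoluted from it.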

For the purpose of this exemplification, the peptides used were tagged with d-Biotin and a
linker (epsilon amino hexanoic acid or some other spacing group). In principle any tag and
linker can be used, although this invention also provides that a tag and linker do not have
to be present if mass spectroscopy, for example, is used to identify the peptide hits. The
purpose of the tag is solely to enable capture of all of the peptides (whether modified or
not) so that excess reagents can be washed away. The reporting systems to detect peptide
modification can include, but are not limited to, antibody recognition, radioactive assay or
mass spectroscopy.

In the library used to exemplify the invention, a biotin tag was chosen because we believed
that this would give improved results. The reasoning for this choice of tag was the high
level of positively charged groups on the enzyme in the area in which the tag sits.
This charged area would cause unfavourable interactions with tags more commonly used by
others in the field, such as poly-lysine or poly-arginine. We would expect this reasoning to
be applicable to any enzyme which binds a highly negatively charged molecule such as, but
not limited to, ATP, close to the peptide binding site.

Tags are preferably non-peptidic, with as little charge, either positive or negative, as
possible. Biotin is a good example of this. The aim is to minimise the interactions of the
tag with the protein so the resultant hits are largely due to the binding of the peptides rather
than reflecting the binding of the tag. The best method of all, if this argument is applied to
its logical conclusion, would be to not use a tag at all and use mass spectroscopy to identify
the peptides. However, currently this approach is of limited value due to the time taken to
run and analyse a library of the size used here to exemplify the invention.

The results obtained from the library screen clearly demonstrated amino acid residues
preferred by the protein kinase at each of the -4 to +4 sites (Figure 2). The 5-mer peptides
overlapped to give information on amino acid preference at each of the binding positions -4
to +4. To confirm this, a consensus peptide, Biotin-εAHA-DEEDYFE(Nle) [SEQ ID No.
3], representing the best -4 to +4 amino acids was made and tested as a substrate (Figure 3).
This substrate gave a Km against ZAP-70 of 15.79 µM, which is better than the best
ZAP-70 substrate described in the literature [remainder of sentence illegible].

Word...\39200PCTJO - 30th October 1998

[Beginning of paragraph illegible] as previously performed for the ZAP-70 library.
[illegible] tetramethylbenzidine substrate and recording absorbance at 450 nm. [Absorbance
values illegible.] Deconvolution of the hit peptides was performed as described in
WO 97/42216 [illegible]. Clear defined substrates were deconvoluted in all library sub-sets.

In the third application of this invention, a recombinant form of the human CSK enzyme
[illegible] as previously performed for the ZAP-70 library. Detection of phosphotyrosine
was achieved using anti-phosphotyrosine antibody detection in an ELISA assay using
tetramethylbenzidine substrate and recording absorbance at 450 nm. Background absorbance
readings of [value illegible] were recorded while the highest substrate peptide value was
0.22. Deconvolution of the hit peptides was performed as described in WO 97/42216 and
Example 5. Clear defined substrates were deconvoluted in all library sub-sets.

In the fourth application of this invention, a recombinant form of the Abelson murine
leukaemia protein tyrosine kinase (v-Abl) was used in an in vitro phosphorylation
reaction to phosphorylate the library sub-set 4, which scans the sequence -1 to +3 around a
zero position tyrosine residue, as previously performed for the ZAP-70 library. Detection of
phosphotyrosine was achieved using anti-phosphotyrosine antibody detection in an
ELISA assay using tetramethylbenzidine substrate and recording absorbance at 450 nm.
Background absorbance readings of 0.11 were recorded while the highest substrate
peptide value was 0.32. Deconvolution of the hit peptides was performed as described in
WO 97/42216 and Example 5. Clear defined substrates were deconvoluted in the library
sub-set.

In the fifth application of this invention, the invention was used to map the substrate 
specificity of a protein serine or serine/threonine kinase (which include I-kappa B kinase 
beta and cAMP-dependent protein kinase [cAPK]). A protein serine or serine/threonine 
kinase enzyme was used in an in vitro phosphorylation reaction to phosphorylate the five 
substrate sub-libraries which scan the sequence -4 to +4 around a central serine residue. 
The library was synthesised as the protein tyrosine kinase ZAP-70 library save that the 
tyrosine fixed residues were replaced with a serine which was then scanned through the 
five sub-libraries. Detection of phosphoserine was achieved using anti-phosphoserine 
antibody detection in an ELISA assay using tetramethylbenzidine substrate and recording 
absorbance at 450 nm. Deconvolution of the hit peptides was performed as described in 
WO 97/42216 and Example 5. 

General formula: (Xaa)x Ser (Xaa)y [SEQ ID No. 4]
Library Sub-Set 1 Xaa-Xaa-Xaa-Xaa-Ser
Library Sub-Set 2 Xaa-Xaa-Xaa-Ser-Xaa
Library Sub-Set 3 Xaa-Xaa-Ser-Xaa-Xaa
Library Sub-Set 4 Xaa-Ser-Xaa-Xaa-Xaa
Library Sub-Set 5 Ser-Xaa-Xaa-Xaa-Xaa

It will be realised by those skilled in the art that the replacement of the recognition residue,
such as the tyrosine or serine, with a residue that is covalently modified by the enzyme to
be mapped allows the active site of any such enzyme to be determined according to the
invention.



The invention will now be described by reference to the following examples. 
Example 1. 

In this example the invention was used to map the active catalytic site of a protein
kinase enzyme that catalyses the phosphorylation of a tyrosine residue. The example
illustrates the synthesis of a number of compounds, and their use as a sub-set library for the
mapping of the enzyme so as to allow the subsequent identification and synthesis of single
specific substrates for the enzyme.

Synthesis of Peptide Compounds. 

Preparation of Crown Assembly 

The peptide compounds were synthesised in parallel fashion using Fmoc-Rink-DA/MDA
derivatised gears (ex Chiron Mimotopes, Australia) loaded at approximately 1.6 µmoles per
crown. Prior to synthesis each crown was connected to its respective stem and slotted into
the 8 x 12 stem holder. Coupling of the amino acids employed standard Fmoc amino acid
chemistry as described in 'Solid Phase Peptide Synthesis', E. Atherton and R.C. Sheppard,
IRL Press Ltd, Oxford, UK, 1989.

Removal of N-Fmoc Protection

A 250 ml solvent resistant bath is charged with [illegible] minutes. The assembly was
removed and excess solvent removed by brief shaking. The [illegible] washed consecutively
with (200 ml each) DMF [illegible].

[illegible] spectrometer at a wavelength of 290 nm [illegible] theoretical Abs290 = 0.6, and
this value compared with the actual [illegible].

Coupling of Standard Amino Acid Residues

[illegible] required during a particular round of coupling. Gear (approx. 1.6 µmoles)
standard couplings were performed in DMF (300 µl).

Coupling of an Amino Acid Residue to Appropriate Well



SUBSTITUTE SPECIFICATION






Whilst the multipin assembly is drying, the appropriate N-Fmoc amino acid pfp esters (10
equivalents calculated from the loading of each crown) and HOBt (10 equivalents) required
for the particular round of coupling are accurately weighed into suitable containers.
Alternatively, the appropriate N-Fmoc amino acids (10 equivalents calculated from the
loading of each crown), desired coupling agent e.g. HBTU (9.9 equivalents calculated from
the loading of each crown) and activation e.g. HOBt (9.9 equivalents calculated from the
loading of each crown), NMM (19.9 equivalents calculated from the loading of each
crown) were accurately weighed into suitable containers.

The protected and activated Fmoc amino acid derivatives were then dissolved in DMF (300
µl for each gear, e.g. for 20 gears, 20 x 10 eq. x 1.6 µmoles of derivative would be dissolved
in 10 ml DMF). The appropriate derivatives were then dispensed to the appropriate wells
ready for commencement of the 'coupling cycle'. As a standard, coupling reactions were
allowed to proceed for 6 hours. The coupled assembly was then washed as detailed below.
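
The reagent quantities implied by the equivalents quoted above follow from simple arithmetic, sketched here in Python. The function name is an assumption for illustration; the loading of 1.6 micromoles per gear, the 20-gear example and the equivalents are the figures quoted in the text.

```python
def reagent_micromoles(n_gears, loading_umol_per_gear, equivalents):
    """Amount of a reagent (in micromoles) needed for one coupling round."""
    return n_gears * loading_umol_per_gear * equivalents

n_gears = 20   # example batch size from the text
loading = 1.6  # micromoles per gear, as quoted in the text

for name, eq in [("Fmoc amino acid", 10), ("HBTU", 9.9), ("HOBt", 9.9), ("NMM", 19.9)]:
    amount = reagent_micromoles(n_gears, loading, eq)
    print(f"{name}: {amount:.1f} micromoles")
```

For the 20-gear example this gives 320 micromoles of amino acid derivative, matching the product 20 x 10 eq. x 1.6 µmoles stated above.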

Coupling of d-Biotin acid to pins 

d-Biotin (10 eq), 1-hydroxybenzotriazole·H2O (10 eq), BOP (9.95 eq) and NMM (19.9 eq)
were dissolved in DMF (0.3 mL per well) and agitated for 2 minutes. 300 µl of solution
was dispensed to each well of a 96-well polypropylene plate. The gears were then added to
the solution and left for 24 hours. Fresh solution was made up, the gears washed in DMF
for 5 minutes and then added to the fresh coupling mixture and left a further 24 hours.

The pin assembly was removed from the plate, shaken free of excess liquid then immersed
in DMF (200 mL) for 5 mins. The assembly was again shaken then immersed in MeOH
(200 mL, 3 x 5 mins) and allowed to air dry.

Washing Following Coupling

If a 20% piperidine/DMF deprotection is to immediately follow the coupling cycle, then the
multipin assembly is briefly shaken to remove excess solvent, washed consecutively with
(200 ml each) MeOH (5 minutes) and DMF (5 minutes) and de-protected (see Removal of
N-Fmoc Protection). If the multipin assembly is to be stored or reacted further, then a full
washing cycle consisting of brief shaking then consecutive washes with (200 ml each) DMF
(5 minutes) and MeOH (5 minutes, 5 minutes, 5 minutes) is performed.

Following these general methods, the peptide libraries shown in Figure 1 were sequentially
assembled by applying the appropriate coupling procedure at the correct cycle during
synthesis.

Acidolytic Mediated Cleavage of Peptide-Pin Assembly 

Acid mediated cleavage protocols were strictly performed in a fume hood. A polystyrene
96 well plate (1 ml/well) was labelled, then the tare weight measured to the nearest mg.

Appropriate wells were charged with a trifluoroacetic acid/triisopropylsilane (95:5, v/v, 600
µl) cleavage solution, in a pattern corresponding to that of the multipin assembly to be
cleaved.

The multipin assembly is added, the entire construct covered in tin foil and left for 2 hours.
The multipin assembly is then added to another polystyrene 96 well plate (1 ml/well)
containing trifluoroacetic acid/triethylsilane (95:5, v/v, 600 µl) (as above) for 5 minutes.



Work up of Cleaved Peptides

The primary polystyrene cleavage plate (2 hour cleavage) and secondary polystyrene
plate [illegible] (minimum drying rate) for 90 minutes.

The contents of the secondary polystyrene plate were transferred to their corresponding
wells on the primary plate using an acetonitrile/water/acetic acid (50:45:5, v/v/v) solution
(3 x 150 µl) and the spent secondary plate discarded.



Analysis of Products 



A 5 µL aliquot from each well was diluted to 100 µl with 0.1% aq. TFA, then a 10 µL aliquot
from this plate diluted with a 100 µl 0.1% aq. TFA. The double diluted plate was
analysed by HPLC-MS.

Final Lyophilisation of Peptides 

The plate was covered with tin foil, held to the plate with an elastic band. A pin prick was
placed in the foil directly above each well and the plate placed at -80°C for 30 minutes. The
plate was then lyophilised on the 'Heto freeze drier' overnight. Finally, the dried plate was
weighed. The total cleaved peptide was quantified (by weight) and the average content of
each peptide calculated. Since all the peptides present have originated from the same
peptide-pin assembly, cleaved under identical conditions, it is reasonable to assume that the
contents of each well are roughly equimolar.

Protein kinase cloning, expression and purification
Polymerase chain reaction (PCR) and downstream cloning

The coding sequence for human ZAP-70 amino acids 306-615 was amplified from Jurkat T
cell cDNA by PCR (2 minutes at 94°C, followed by 35 cycles of 15 seconds 94°C, 30
seconds 65°C, 2 minutes 72°C and a final single 5 minute 72°C incubation) using the
primers:



5' CCGGGATCCGCCATGCCCATGGACACGAGCGTGTAT 3' [SEQ ID No. 5]

5' GGGGGATCCTCAGTGGTGGTGGTGGTGGTGGGCACAGGCAGCCTCAGC 
CTTCTGTGT 3' [SEQ ID No. 6] 

The PCR amplicon was cloned into the Bam HI site of pUC19 and sequence confirmed
using M13-20 and reverse primers on an Applied Biosystems Prism 310 sequencer as
described by manufacturer. The Bam HI ZAP-70 insert was excised from the sequencing
vector and ligated into the Baculoviral transfer vector pAcUW51 (Pharmingen).

Generation of ZAP-70 enzyme using baculovirus 

Homologous recombination with wild type baculoviral DNA was then performed in Sf9
insect cells and viral supernatant harvested. Plaque purified virus was exposed to several
viral amplification steps then used at a titre of 3x10[exponent illegible] PFU/ml to infect 3 l
of Sf9 cells at 1x10[exponent illegible] cells/ml at an MOI of 10 in an Applikon bioreactor
using 60% dissolved oxygen. Cells were harvested 3 days post infection.

Protein purification 

The infected cell pellet was lysed in 50 mM Tris-HCl, pH 7.5, 0.15 M NaCl, 25% sucrose,
1 mM 4-nitrophenol phosphate, 1 mM sodium orthovanadate and protease inhibitors.






Following homogenisation using a dounce pestle B, the cleared lysate was loaded onto a
cobalt-sepharose column. After column washing with lysis buffer, elution was performed
with an imidazole gradient and ZAP-70 fractions identified by protein kinase activity
against the peptide substrate Lys-Lys-Lys-Lys-Ala-Asp-Glu-Glu-Asp-Tyr-Phe-Ile-Pro-Pro-
Ala as described in Casnellie et al, 1991.



Library Screening 

Library peptides were phosphorylated in pools of 20 peptides at a final concentration of 1
µM total peptide in 50 mM HEPES, pH 7.5, 0.1% Triton X-100, 100 µM ATP, 10 mM
MnCl2, 1 mM DTT and 0.2 mM sodium orthovanadate for 30 minutes at 30°C. These
reaction mixtures were then stopped using 100 mM EDTA, 6 mM adenosine, transferred to
streptavidin-coated microtitre plates and allowed to bind for 30 minutes at 20°C.
Unincorporated reaction products were washed from the plate using PBS/0.1% Tween 20,
then the plate was incubated with anti-phosphotyrosine antibody (Sigma mouse
monoclonal clone PT66) in 2% BSA/PBS/Tween for 1 hour at 20°C. Unbound antibody
was removed by plate washing using PBS/Tween, then incubation was performed with rabbit
anti-mouse-HRP (Amersham) for a further 1 hour. HRP detection was then performed with
tetramethylbenzidine (TMB) substrate, measuring absorbance at 450 nm using a
spectrophotometer (Dynex MRX).

The best substrates were identified as those which gave the highest amount of phosphate
incorporation. The library subsets were deconvolved according to the teaching of
WO97/42216: this gives an immediate determination of the unique sequence of any
phosphorylated motif without the need for further synthesis or sequencing (Figure 2 [SEQ
ID Nos. 7, 8, 9, 10]).

Peptide Km determination



Biotin-tagged peptides were phosphorylated at varying concentrations in 50 mM HEPES,
pH 7.5, 0.1% Triton X-100, 200 µM ATP, 10 mM MnCl2, 1 mM DTT and 0.2 mM sodium
orthovanadate using 0.5 µCi "P-γ-ATP for 10 minutes at 30°C. The reactions were stopped
using 2 M guanidine hydrochloride and diluted 1 in 10 in water, then 5 µl of the reaction was spotted
onto SAM titre plates (Promega). Unincorporated reaction products were washed off as
described by the manufacturer, then 20 µl scintillation liquid was added and the plate counted on a
Packard TopCount beta-counter. Km was calculated using a non-linear one-site hyperbola
model (ATP was added in excess to negate the influence of the ATP binding site on the
substrate site kinetics). (Figure 3).
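A one-site hyperbola fit of this kind can be sketched as follows. The sketch assumes the model v = Vmax x [S] / (Km + [S]); the function name `fit_km`, the search bounds and the data points are hypothetical (the data are generated from the model itself, not taken from Figure 3), and the fitting routine is a plain least-squares search written for illustration rather than the software used for the original analysis.

```python
import math


def fit_km(conc, rate, km_min=1e-3, km_max=1e4):
    """Least-squares fit of v = Vmax*[S]/(Km+[S]); returns (Km, Vmax)."""

    def vmax_for(km):
        # For a fixed Km the model is linear in Vmax, so Vmax has a closed form.
        x = [s / (km + s) for s in conc]
        return sum(a * v for a, v in zip(x, rate)) / sum(a * a for a in x)

    def sse(log_km):
        km = 10.0 ** log_km
        vmax = vmax_for(km)
        return sum((v - vmax * s / (km + s)) ** 2 for s, v in zip(conc, rate))

    # Coarse scan over log10(Km) to bracket the minimum ...
    lo, hi = math.log10(km_min), math.log10(km_max)
    grid = [lo + (hi - lo) * i / 200 for i in range(201)]
    best = min(range(201), key=lambda i: sse(grid[i]))
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, 200)]

    # ... then golden-section refinement inside the bracket.
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    km = 10.0 ** ((a + b) / 2)
    return km, vmax_for(km)


# Hypothetical, noise-free data generated with Km = 10 and Vmax = 100
# (arbitrary units), used only to show that the fit recovers its inputs.
conc = [1, 2, 5, 10, 20, 50, 100]
rate = [100 * s / (10 + s) for s in conc]
km, vmax = fit_km(conc, rate)
print(f"Km = {km:.3f}, Vmax = {vmax:.3f}")
```

With real counts the residuals would not vanish, and a dedicated curve-fitting package would normally also report confidence intervals.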

Example 2 

In this example the invention was used to map the active catalytic site of Syk, a protein
kinase enzyme that catalyses the phosphorylation of a tyrosine residue. The example
illustrates the positional scanning of the sub-set libraries for the mapping of the enzyme
and their use so as to allow the subsequent identification of the preferred substrates of the
enzyme catalytic site.

The mapping and assessment of the catalytic site was performed as detailed in Example 1.
The substrate preferences were deconvoluted as detailed in WO 97/42216 and are detailed
below.



Library Sub-Set 1    Asp-Glu-Glu-Asp-Tyr [SEQ ID No. 11]
Library Sub-Set 2    Asp-Glu-Glu-Tyr-Asp [SEQ ID No. 12]
Library Sub-Set 3    Asp-Glu-Tyr-Glu-Asp [SEQ ID No. 13]
Library Sub-Set 4    Asp-Tyr-Glu-Glu-Val [SEQ ID No. 14]
Library Sub-Set 5    Tyr-Ser-Ile-Ile-Nle [SEQ ID No. 15]

Example 3 

In this example the invention was used to map the active catalytic site of CSK. The subset
library was used to scan the enzyme active site so as to allow the subsequent identification
and synthesis of the preferred specific substrates for the enzyme as listed below.



Library Sub-Set 1    Asp-Glu-Glu-Glu-Tyr [SEQ ID No. 16]
Library Sub-Set 2    Asp-Glu-Glu-Tyr-Phe [SEQ ID No. 17]
Library Sub-Set 3    Asp-Glu-Tyr-His-Asn [SEQ ID No. 18]
Library Sub-Set 4    Asp-Tyr-His-Leu-Phe [SEQ ID No. 19]
Library Sub-Set 5    Tyr-Pro-Ile-Glu-Val [SEQ ID No. 20]



Example 4 



In this example the invention was used to map the active catalytic site of v-Abl, a protein
kinase enzyme that catalyses the phosphorylation of a tyrosine residue. For this enzyme
only the Library Sub-Set 4 (i.e. X-Tyr-X-X-X according to SEQ ID No. 2) was scanned
with the enzyme. The active-site recognition substrate identified for this enzyme with this
Sub-Set was serine-tyrosine-phenylalanine-histidine-glutamine [SEQ ID No. 21].

Example 5

Deconvolution methodology from WO97/42216.

Libraries or sub-libraries are arranged as two orthogonal sets of mixtures of compounds in 
solution providing two complementary combinatorial libraries indexed in two dimensions for 
autodeconvolution. These are referred to as primary and secondary libraries. 

The general concept of two orthogonal sets of mixtures indexed in two dimensions can be
applied to various permutations of numbers of wells, plate layout, number of permutations
per mixture etc. However, according to the invention the numerical interrelationship is
defined as indicated below for libraries containing compounds with four variable groups B,
C, D and E.

General Deconvolution Formulae 
-Bb-Cc-Dd-n(Ee)- (I) 

1) Primary and Secondary plates preferably have the same number of compounds per
well [X]; otherwise there are two values, Xp and Xs respectively.

2) The primary library comprises [np] plates.

If Rp x Cp = Rs x Cs, then the number of plates in the secondary library is also [np]. If not, the
number of plates in the secondary library [ns] is:

ns = (Rp x Cp x np) / (Rs x Cs)

e.g., a primary library of np = 4, Rp = 8, Cp = 10 can be set out in an Rs = 4, Cs = 5
secondary library with the number of plates equal to:

ns = (8 x 10 x 4) / (4 x 5)

= 16 plates.
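The plate-count relationship can be checked numerically. The following is a minimal sketch; the function name `secondary_plate_count` is hypothetical, and the check that the result is a whole number is an added assumption rather than something stated in the text.

```python
from fractions import Fraction


def secondary_plate_count(rp, cp, n_p, rs, cs):
    """ns = (Rp x Cp x np) / (Rs x Cs), using exact arithmetic."""
    ns = Fraction(rp * cp * n_p, rs * cs)
    if ns.denominator != 1:
        # Assumption: a layout is only usable if it fills whole plates.
        raise ValueError("layout does not give a whole number of plates")
    return int(ns)


# Worked example from the text: np = 4, Rp = 8, Cp = 10, Rs = 4, Cs = 5.
print(secondary_plate_count(rp=8, cp=10, n_p=4, rs=4, cs=5))
```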

Number of compounds per well 
-Bb-Cc-Dd-np(Ee)- (1) 

Number of possible combinations [k] is given by: 
k = b X c x d X np X e (2) 

When number of wells on a plate = [N], number of compounds per well = [X] and number of 
plates = [np]. 

k = X X N X np (3) 

However, number of wells [N] is also defined by the number of rows [Rp] and number of
columns [Cp].

N = Rp X Cp (4) 



Combining (3) and (4). 

k = X X Rp X Cp X np (5) 
Combining (2) and (5) 

b X c X d X np X e = X X Rp X Cp X np (6)

Cancelling [np] from both sides of the equation: 
b X c X d X e = X X Rp X Cp (7) 

Two of the variables (e.g., b and c) on the left side of the equation must each be equal in
number to the number of columns [Cp], whilst a remaining variable (e.g., d) on the left side
must be equal in number to the number of rows [Rp]. So:
[Cp]^2 X Rp X e = X X Rp X Cp (8)

Cancelling [Cp] and [Rp] from both sides of the equation:
Cp X e = X

where [e] is the number of variants along a fixed row; and if Rp = Cp, then Rp x e = X 

Example for a 10 x 10 x 8 x 8 format over 4 plates:
np X e = 8 => e = 2 
10x2 = X 
X = 20. 
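The worked example can be reproduced from equation (7). This is a sketch only: the function name and argument names are hypothetical, and `e_total` is taken to mean the total number of E variants spread evenly over the plates.

```python
from fractions import Fraction


def compounds_per_well(b, c, d, e_total, n_plates, rows, cols):
    """Solve b x c x d x e = X x Rp x Cp (equation 7) for X."""
    if e_total % n_plates:
        raise ValueError("E variants must divide evenly over the plates")
    e = e_total // n_plates          # E variants per plate (np x e = e_total)
    x = Fraction(b * c * d * e, rows * cols)
    if x.denominator != 1:
        raise ValueError("layout does not give a whole number per well")
    return int(x)


# 10 x 10 x 8 x 8 format over 4 plates, on plates of 8 rows x 10 columns.
x = compounds_per_well(b=10, c=10, d=8, e_total=8, n_plates=4, rows=8, cols=10)
print(x)
```

With b = c = Cp and d = Rp this reduces to X = Cp x e, which is the shortcut used in the text.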



From an understanding of the general deconvolution formulae shown above, those skilled in
the art will readily appreciate that the advantageous results of self-deconvolution according to
WO97/42216 are obtainable utilising a number of different arrangements of wells, plate
layouts, mixtures etc.

The technique will be illustrated by reference to a model system for screening a protease with
two complementary compound libraries, L1 and L2, each containing n x 1600 compounds of
the type A-B1-10-C1-10-D1-8-n(E1-2)-F-G [II], in which:

A = a fluorescor internally quenched by F, preferably an unsubstituted or substituted
anthranilic acid derivative, connected by an amide bond to B.

B, C, D, E are natural or unnatural amino acid residues connected together by suitable bonds,
although B, C, D and E can be any set of groups.

F = a quencher capable of internally quenching the fluorescor A, preferably an unsubstituted
or substituted 3-nitrotyrosine derivative.

aqueous solubility. Also, G should not be a substrate for any type of enzyme.

n = any integer between 1 and 4 inclusive.

The numbers represented in subscript following residues B, C, D and E refer to the number of
possibilities from which those residues are selected. Thus, by way of illustration,
A-B1-5-C-D-E1-2-F-G represents a mixture of the following ten compounds:

A-B1-C-D-E1-F-G
A-B2-C-D-E1-F-G
A-B3-C-D-E1-F-G
A-B4-C-D-E1-F-G
A-B5-C-D-E1-F-G
A-B1-C-D-E2-F-G
A-B2-C-D-E2-F-G
A-B3-C-D-E2-F-G
A-B4-C-D-E2-F-G
A-B5-C-D-E2-F-G
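The subscript-range notation expands mechanically into its member compounds. A short sketch of that expansion, with a hypothetical helper name, is:

```python
from itertools import product


def expand_mixture(b_options, e_options):
    """List every compound named by A-B(range)-C-D-E(range)-F-G."""
    # E is the outer loop so the order matches the listing in the text.
    return [f"A-B{b}-C-D-E{e}-F-G" for e, b in product(e_options, b_options)]


# A-B1-5-C-D-E1-2-F-G: five choices at B, two at E.
mixture = expand_mixture(range(1, 6), range(1, 3))
for compound in mixture:
    print(compound)
```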

The general combinatorial formula for each library can be expressed as:

A1-B10-C10-D8-n(E2)-F1-G1 (III)

providing 1 x 10 x 10 x 8 x n x 2 x 1 x 1 = 1600n compounds.

Both compound libraries, L1 and L2, of the above type are synthesized using solid phase
techniques using the Multipin approach such that each library contains 1600n compounds as
80n mixtures of 20 distinct, identifiable compounds. These 20 component mixtures are then
placed separately into each of 80 wells of a 96 well plate (the other two lanes are used for
control experiments) and then screened against a known quantity of the protease.

Thus it is important that, regardless of the number of compounds contained in the two
libraries L1 and L2 (e.g., in the preferred embodiment 1600n, where n = any integer between
1 and 4), the libraries themselves are complementary and amenable to deconvolution without
recourse to resynthesis.

The general library layout will now be described with reference to Figures
4 to 17, which exemplify component distributions in the plates of a library matrix:

For example, when n = 1 and the library contains 1600 compounds, in the first column of the
first row (A1) (Fig. 4) in the first plate (P1) of the library L1 (hereinafter designated as
location A1, P1, L1) there will be one C component (C1), one D component (D1), the ten B
components, and the two E components (E1 and E2) (Fig. 5). In the tenth column of the first
row (A10) in the first plate (P1) of the library L1 (hereinafter designated as location A10, P1,
L1), there will be one C component (C10), one D component (D1), the ten B components and
the two E components (E1 and E2). In the tenth column of the eighth row (H10) in the first
plate (P1) of the library L1 (hereinafter designated as location H10, P1, L1), there will be one
C component (C10), one D component (D8), the ten B components, and the two E components
(E1 and E2). Hence all 1600 components are present in the one plate, because the 80 wells
each contain 20 components.
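The layout of library L1 described above can be written down as a small lookup. The sketch below assumes, from the A1, A10 and H10 examples, that columns 1 to 10 index C1 to C10, rows A to H index D1 to D8, and plate Pk carries the pair E(2k-1), E(2k); the function name and the tuple encoding (b, c, d, e) are hypothetical.

```python
from itertools import product

ROWS = "ABCDEFGH"


def l1_well(plate, row, col):
    """Return the set of (b, c, d, e) compounds in one well of library L1."""
    c = col                           # column number selects the C component
    d = ROWS.index(row) + 1           # row letter selects the D component
    e_pair = (2 * plate - 1, 2 * plate)
    # Every well holds all ten B components combined with the plate's E pair.
    return {(b, c, d, e) for b, e in product(range(1, 11), e_pair)}


well_a1 = l1_well(1, "A", 1)
well_h10 = l1_well(1, "H", 10)
print(len(well_a1), sorted(well_h10)[:2])
```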

A second complementary library is synthesised as follows (Fig. 6). In the first column of the
first row (A1) of the first plate (P1) of the library L2 (hereinafter designated as location A1,
P1, L2), there will be ten C components, two D components (D1 and D2), one B component
(B1), and one E component (E1). In the tenth column of the first row (A10) of the first plate
(P1) of the library L2 (hereinafter designated as location A10, P1, L2), there will be ten C
components, two D components (D1 and D2), one B component (B10), and one E component
(E1). In the first column of the second row (B1) of the first plate (P1) of the library L2
(hereinafter designated as location B1, P1, L2), there will be ten C components, two D
components (D1 and D2), one B component (B1), and one E component (E2).

In the first column of the first row (A1) of the second plate (P2) of the library L2 (hereinafter
designated as location A1, P2, L2), there will be ten C components, two D components (D3
and D4), one B component (B1), and one E component (E1) (Fig. 7). In the tenth column of
the first row (A10) of the second plate (P2) of the library L2 (hereinafter designated as
location A10, P2, L2), there will be ten C components, two D components (D3 and D4), one B
component (B10), and one E component (E1). In the first column of the second row (B1) of
the second plate (P2) of the library L2 (hereinafter designated as location B1, P2, L2), there
will be ten C components, two D components (D3 and D4), one B component (B1), and one E
component (E2). In the tenth column of the second row (B10) of the second plate (P2) of the
library L2 (B10, P2, L2), there will be ten C components, two D components (D3 and D4),
one B component (B10), and one E component (E2). Hence only the first two rows are used to
accommodate 400 compounds in total.

In the first column of the first row (A1) of the third plate (P3) of the library L2 (hereinafter
designated as location A1, P3, L2), there will be ten C components, two D components (D5
and D6), one B component (B1), and one E component (E1) (Fig. 8). In the tenth column of
the first row (A10) of the third plate (P3) of the library L2 (hereinafter designated as location
A10, P3, L2), there will be ten C components, two D components (D5 and D6), one B
component (B10), and one E component (E1). In the first column of the second row (B1) of
the third plate (P3) of the library L2 (hereinafter designated as location B1, P3, L2), there will
be ten C components, two D components (D5 and D6), one B component (B1), and one E
component (E2). In the tenth column of the second row (B10) of the third plate (P3) of the
library L2 (B10, P3, L2), there will be ten C components, two D components (D5 and D6),
one B component (B10), and one E component (E2). Hence only the first two rows are used to
accommodate 400 compounds in total.

In the first column of the first row (A1) of the fourth plate (P4) of the library L2 (hereinafter
designated as location A1, P4, L2), there will be ten C components, two D components (D7
and D8), one B component (B1), and one E component (E1) (Fig. 9). In the tenth column of
the first row (A10) of the fourth plate (P4) of the library L2 (hereinafter designated as location
A10, P4, L2), there will be ten C components, two D components (D7 and D8), one B
component (B10), and one E component (E1). In the first column of the second row (B1) of
the fourth plate (P4) of the library L2 (hereinafter designated as location B1, P4, L2), there
will be ten C components, two D components (D7 and D8), one B component (B1), and one E
component (E2). In the tenth column of the second row (B10) of the fourth plate (P4) of the
library L2 (B10, P4, L2) there will be ten C components, two D components (D7 and D8), one
B component (B10), and one E component (E2). Hence only the first two rows are used to
accommodate 400 compounds in total.

In this fashion two complementary libraries, L1 and L2, are prepared. In library L1, each of
the 80 wells contains a mixture of 20 components, providing 1600 compounds for
screening. In library L2, four plates are used in which only the first two rows are employed,
providing 20 wells of 20 components per well per plate, and furnishing the same 1600
compounds as are present in library L1, but in a format in which no two compounds found
together in library L1 will be found together in library L2.
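The complementarity property can be verified exhaustively. The sketch below encodes the two layouts as inferred from the well-by-well description (L1: plate from the E pair, row from D, column from C; L2: plate from the D pair, row from E, column from B); the function names are hypothetical. Two compounds share a well in both libraries only if they map to the same pair of locations, so it is enough to confirm that every compound has a unique (L1 location, L2 location) pair.

```python
from itertools import product


def l1_location(b, c, d, e):
    """(plate, row index, column) of a compound in library L1."""
    return ((e + 1) // 2, d, c)


def l2_location(b, c, d, e):
    """(plate, row index, column) of a compound in library L2."""
    return ((d + 1) // 2, e, b)


def is_complementary(n):
    """True if no two compounds share a well in both L1 and L2."""
    seen = set()
    compounds = product(range(1, 11), range(1, 11), range(1, 9),
                        range(1, 2 * n + 1))
    for compound in compounds:
        key = (l1_location(*compound), l2_location(*compound))
        if key in seen:
            return False
        seen.add(key)
    return True


print(is_complementary(1), is_complementary(4))
```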

Thus it is important that the compounds contained in the two libraries L1 and L2 are
themselves complementary, in that any two compounds which are found together in a 20
component mixture in the same location (e.g., A1, P1, L1) in library L1 are not found
together in any of the 20 component mixtures in any location of the library L2.

Thus, for example, with reference to the primary library P1 L1 of Figure 3 and the secondary
libraries P1 L2, P2 L2, P3 L2 and P4 L2 of Figures 6-9, it is possible to deconvolute an
exemplary sequence:

-B2-C3-D4-E1-

If the library is a FRET library and this sequence is a substrate, fluorescence will occur in P1
L1 at C3D4. This gives the information that the substrate is:

-?-C3-D4-?-

If fluorescence occurs in P2 L2 at B2E1, it indicates a substrate:

-B2-?-?-E1-

The confirmation of the substrate as:

-B2-C3-D4-E1-

should be provided by non-fluorescence of P1 L2, P3 L2 and P4 L2, which all contain
-B2-C3-X-E1-, where X is not D4.
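The deconvolution step amounts to intersecting the information from a fluorescent well in L1 with that from a fluorescent well in L2. A minimal sketch, assuming the same layout conventions as in the description (L1: rows index D, columns index C, plate gives the E pair; L2: rows index E, columns index B, plate gives the D pair) and using a hypothetical function name and hit format:

```python
ROWS = "ABCDEFGH"


def deconvolute(l1_hits, l2_hits):
    """Combine hits, each given as (plate, row letter, column), into sequences."""
    sequences = []
    for p1, row1, col1 in l1_hits:
        d, c = ROWS.index(row1) + 1, col1      # L1 well fixes C and D
        for p2, row2, col2 in l2_hits:
            e, b = ROWS.index(row2) + 1, col2  # L2 well fixes B and E
            # Keep only pairs of wells that could contain the same compound:
            # D must belong to the L2 plate's pair, E to the L1 plate's pair.
            if (d + 1) // 2 == p2 and (e + 1) // 2 == p1:
                sequences.append(f"-B{b}-C{c}-D{d}-E{e}-")
    return sequences


# Fluorescence at C3D4 in P1 L1 (row D, column 3) and at B2E1 in P2 L2
# (row A, column 2), as in the example above.
result = deconvolute([(1, "D", 3)], [(2, "A", 2)])
print(result)
```

When several wells fluoresce in each library, every consistent pairing is returned, which is why the non-fluorescent wells are still needed as confirmation.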

In practice it is likely that more than one sequence will result in a substrate. Information as to
which positions B-C-D-E- are sensitive to change (i.e., require a specific group) and which
are insensitive (i.e., can tolerate more than one choice of group) in the context of the whole
sequence gives valuable SAR data which can be used to model and/or synthesise related
compounds.
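One simple way to summarise such data is to tabulate, for each position, how many different groups appear among the deconvoluted substrates. The sketch below is a deliberately crude illustration: the function name, the "sensitive"/"insensitive" rule (exactly one group observed versus more than one) and the hit list are all hypothetical.

```python
def position_tolerance(substrates):
    """Classify positions B, C, D, E by the variety of groups seen in hits."""
    positions = "BCDE"
    seen = {p: set() for p in positions}
    for substrate in substrates:
        for position, group in zip(positions, substrate):
            seen[position].add(group)
    return {p: "sensitive" if len(groups) == 1 else "insensitive"
            for p, groups in seen.items()}


# Hypothetical hits written as (b, c, d, e) index tuples.
hits = [(2, 3, 4, 1), (2, 3, 4, 2), (7, 3, 4, 1)]
summary = position_tolerance(hits)
print(summary)
```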

In analogous examples, where separately n = 2, 3 or 4, extra plates are constructed in library
L1 format to accommodate the component pairs E3 and E4 (n = 2), E5 and E6 (n = 3) and E7
and E8 (n = 4), respectively. For the respective deconvolution libraries of the type L2, the
respective rows in the plates P1, P2, P3 and P4 are increasingly filled with the paired
components D1 and D2, D3 and D4, D5 and D6, and D7 and D8, respectively.

For example, when n = 3, and the library contains 4800 compounds, in the first column of the
first row (A1) in the first plate (P1) of the library L1 (hereinafter designated as location A1,
P1, L1) there will be one C component (C1), one D component (D1), the ten B components,
and the two E components (E1 and E2). In the tenth column of the first row (A10) in the first
plate (P1) of the library L1 (hereinafter designated as location A10, P1, L1) there will be one
C component (C10), one D component (D1), the ten B components, and the two E components
(E1 and E2). In the tenth column of the eighth row (H10) in the first plate (P1) of the library
L1 (hereinafter designated as location H10, P1, L1) there will be one C component (C10), one
D component (D8), the ten B components, and the two E components (E1 and E2). Hence
1600 components are present in the one plate, because the 80 wells each contain 20
components.

In the first column of the first row (A1) in the second plate (P2) of the library L1 (hereinafter
designated as location A1, P2, L1) there will be one C component (C1), one D component
(D1), the ten B components, and the two E components (E3 and E4). In the tenth column of
the first row (A10) in the second plate (P2) of the library L1 (hereinafter designated as
location A10, P2, L1) there will be one C component (C10), one D component (D1), the ten B
components, and the two E components (E3 and E4). In the tenth column of the eighth row
(H10) in the second plate (P2) of the library L1 (hereinafter designated as location H10, P2,
L1) there will be one C component (C10), one D component (D8), the ten B components, and
the two E components (E3 and E4). Hence 1600 components are present in the one plate,
because the 80 wells each contain 20 components.

In the first column of the first row (A1) in the third plate (P3) of the library L1 (hereinafter
designated as location A1, P3, L1), there will be one C component (C1), one D component
(D1), the ten B components, and the two E components (E5 and E6). In the tenth column of
the first row (A10) in the third plate (P3) of the library L1 (hereinafter designated as location
A10, P3, L1) there will be one C component (C10), one D component (D1), the ten B
components, and the two E components (E5 and E6). In the tenth column of the eighth row
(H10) in the third plate (P3) of the library L1 (hereinafter designated as location H10, P3, L1)
there will be one C component (C10), one D component (D8), the ten B components, and the
two E components (E5 and E6). Hence 1600 components are present in the one plate, because
the 80 wells each contain 20 components. In total the three plates, P1, P2 and P3, contain
1600 compounds per plate, i.e. 4800 compounds in total.
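The way library L1 grows with n can be tabulated directly. A brief sketch, with a hypothetical function name and record layout:

```python
def l1_plates(n):
    """Describe the n plates of library L1: E pair and compound count each."""
    if not 1 <= n <= 4:
        raise ValueError("n must be an integer between 1 and 4 inclusive")
    return [
        {
            "plate": f"P{k}",
            "e_pair": (f"E{2 * k - 1}", f"E{2 * k}"),
            "wells": 80,
            "compounds": 80 * 20,   # 80 wells of 20 components each
        }
        for k in range(1, n + 1)
    ]


plates = l1_plates(3)
total = sum(p["compounds"] for p in plates)
print([p["e_pair"] for p in plates], total)
```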

For example, when n = 4, and the library contains 6400 compounds, in the first column of the
first row (A1) in the first plate (P1) of the library L1 (hereinafter designated as location A1,
P1, L1) there will be one C component (C1), one D component (D1), the ten B components,
and the two E components (E1 and E2) (Fig. 10). In the tenth column of the first row (A10) in
the first plate (P1) of the library L1 (hereinafter designated as location A10, P1, L1) there will
be one C component (C10), one D component (D1), the ten B components, and the two E
components (E1 and E2). In the tenth column of the eighth row (H10) in the first plate (P1) of
the library L1 (hereinafter designated as location H10, P1, L1) there will be one C component
(C10), one D component (D8), the ten B components, and the two E components (E1 and E2).
Hence 1600 components are present in the one plate, because the 80 wells each contain 20
components.

In the first column of the first row (A1) in the second plate (P2) of the library L1 (hereinafter
designated as location A1, P2, L1) there will be one C component (C1), one D component
(D1), the ten B components, and the two E components (E3 and E4) (Fig. 11). In the tenth
column of the first row (A10) in the second plate (P2) of the library L1 (hereinafter
designated as location A10, P2, L1) there will be one C component (C10), one D component
(D1), the ten B components, and the two E components (E3 and E4). In the tenth column of
the eighth row (H10) in the second plate (P2) of the library L1 (hereinafter designated as
location H10, P2, L1) there will be one C component (C10), one D component (D8), the ten B
components, and the two E components (E3 and E4).

In the first column of the first row (A1) in the third plate (P3) of the library L1 (hereinafter
designated as location A1, P3, L1) there will be one C component (C1), one D component
(D1), the ten B components, and the two E components (E5 and E6) (Fig. 12). In the tenth
column of the first row (A10) in the third plate (P3) of the library L1 (hereinafter designated
as location A10, P3, L1) there will be one C component (C10), one D component (D1), the ten
B components, and the two E components (E5 and E6). In the tenth column of the eighth row
(H10) in the third plate (P3) of the library L1 (hereinafter designated as location H10, P3, L1)
there will be one C component (C10), one D component (D8), the ten B components, and the
two E components (E5 and E6).

In the first column of the first row (A1) in the fourth plate (P4) of the library L1 (hereinafter
designated as location A1, P4, L1) there will be one C component (C1), one D component
(D1), the ten B components, and the two E components (E7 and E8) (Fig. 13). Likewise, in the
tenth column of the first row (A10) in the fourth plate (P4) of the library L1 (hereinafter
designated as location A10, P4, L1) there will be one C component (C10), one D component
(D1), the ten B components, and the two E components (E7 and E8). In the tenth column of
the eighth row (H10) in the fourth plate (P4) of the library L1 (hereinafter designated as
location H10, P4, L1) there will be one C component (C10), one D component (D8), the ten B
components, and the two E components (E7 and E8).

A second complementary library is synthesised as follows. In the first column of the first row
(A1) of the first plate (P1) of the library L2 (hereinafter designated as location A1, P1, L2),
there will be ten C components, two D components (D1 and D2), one B component (B1), and
one E component (E1) (Fig. 14). In the tenth column of the first row (A10) of the first plate
(P1) of the library L2 (hereinafter designated as location A10, P1, L2), there will be the ten C
components, two D components (D1 and D2), one B component (B10), and one E component
(E1). In the first column of the eighth row (H1) of the first plate (P1) of the library L2
(hereinafter designated as location H1, P1, L2), there will be the ten C components, two D
components (D1 and D2), one B component (B1), and one E component (E8). In the tenth
column of the eighth row (H10) of the first plate (P1) of the library L2 (hereinafter designated
as location H10, P1, L2) there will be the ten C components, two D components (D1 and D2),
one B component (B10), and one E component (E8). Hence the matrix containing all ten
columns and all eight rows is used to accommodate 1600 compounds in total.

In the first column of the first row (A1) of the second plate (P2) of the library L2 (hereinafter
designated as location A1, P2, L2), there will be ten C components, two D components (D3
and D4), one B component (B1), and one E component (E1) (Fig. 15). In the tenth column of
the first row (A10) of the second plate (P2) of the library L2 (hereinafter designated as
location A10, P2, L2), there will be ten C components, two D components (D3 and D4), one B
component (B10), and one E component (E1). In the first column of the second row (B1) of
the second plate (P2) of the library L2 (hereinafter designated as location B1, P2, L2), there
will be ten C components, two D components (D3 and D4), one B component (B1), and one E
component (E2). In the tenth column of the eighth row (H10) of the second plate (P2) of the
library L2 (hereinafter designated as location H10, P2, L2), there will be ten C components,
two D components (D3 and D4), one B component (B10), and one E component (E8).

In the first column of the first row (A1) of the third plate (P3) of the library L2 (hereinafter
designated as location A1, P3, L2), there will be ten C components, two D components (D5
and D6), one B component (B1) and one E component (E1) (Fig. 16). In the tenth column of
the first row (A10) of the third plate (P3) of the library L2 (hereinafter designated as location
A10, P3, L2), there will be ten C components, two D components (D5 and D6), one B
component (B10), and one E component (E1). In the first column of the second row (B1) of
the third plate (P3) of the library L2 (hereinafter designated as location B1, P3, L2), there will
be ten C components, two D components (D5 and D6), one B component (B1), and one E
component (E2). In the tenth column of the eighth row (H10) of the third plate (P3) of the
library L2 (hereinafter designated as location H10, P3, L2), there will be ten C components,
two D components (D5 and D6), one B component (B10), and one E component (E8).

In the first column of the first row (A1) of the fourth plate (P4) of the library L2 (hereinafter
designated as location A1, P4, L2), there will be ten C components, two D components (D7
and D8), one B component (B1), and one E component (E1) (Fig. 17). In the tenth column of
the first row (A10) of the fourth plate (P4) of the library L2 (hereinafter designated as
location A10, P4, L2), there will be ten C components, two D components (D7 and D8), one B
component (B10), and one E component (E1). In the first column of the second row (B1) of
the fourth plate (P4) of the library L2 (hereinafter designated as location B1, P4, L2), there
will be ten C components, two D components (D7 and D8), one B component (B1), and one E
component (E2). In the tenth column of the eighth row (H10) of the fourth plate (P4) of the
library L2 (hereinafter designated as location H10, P4, L2), there will be ten C components,
two D components (D7 and D8), one B component (B10), and one E component (E8).
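For n = 4 the L2 layout fills every row of each plate. The sketch below encodes the rule implied by the locations listed above (columns 1 to 10 index B1 to B10, rows A to H index E1 to E8, plate Pk carries the pair D(2k-1), D(2k)); the function name and the (b, c, d, e) tuple encoding are hypothetical.

```python
from itertools import product

ROWS = "ABCDEFGH"


def l2_well(plate, row, col):
    """Return the set of (b, c, d, e) compounds in one well of library L2."""
    b = col                           # column number selects the B component
    e = ROWS.index(row) + 1           # row letter selects the E component
    d_pair = (2 * plate - 1, 2 * plate)
    # Every well holds all ten C components combined with the plate's D pair.
    return {(b, c, d, e) for c, d in product(range(1, 11), d_pair)}


well_h10_p4 = l2_well(4, "H", 10)
library = set().union(*(l2_well(p, r, c)
                        for p in range(1, 5)
                        for r in ROWS
                        for c in range(1, 11)))
print(len(well_h10_p4), len(library))
```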

The strategy is thus based on the synthesis of two orthogonal sets of mixtures in solution.
These solutions are each indexed in two dimensions. Thus the data from a scan identifies the
most active compounds without the need for decoding or resynthesis.

The positional preferences of sub-units (in this case amino acids) are optimised with respect
to all other variant positions simultaneously. The synergistic relationship between all four
positions is realised and both positive (beneficial) and negative (deactivating) data are
generated. This leads to families (sub-populations) of substrates and their sub-unit
preferences. The data can be fed into molecular modelling programs to generate
pharmacophoric descriptors that encompass both the desirable features (from the positive
data) and indicate undesirable interactions (from the negative data sets).

Note that a one dimensional scan only indicates one position at a time as 'most active' and 
does not explore the synergistic relationship between positions. 

The general methodology exemplified above with regard to the use of complementary
combinatorial FRET libraries for the identification of proteolytic enzyme substrates is
equally applicable for identification of compounds from a library which interact with another
active moiety.

Combinatorial libraries of compounds containing four variable groups B, C, D and E can be
produced and interactions with active moieties detected using suitable reporters or markers.